Abstract - A water recycling system (WRS) deployed at NASA 
Ames Research Center’s Sustainability Base (an energy efficient 
office building that integrates some novel technologies developed 
for space applications) will serve as a testbed for long duration 
testing of next generation spacecraft water recycling systems 
for future human spaceflight missions. This system cleans 
graywater (waste water collected from sinks and showers) and 
recycles it into clean water. Like all engineered systems, the 
WRS is prone to standard degradation due to regular use, as 
well as other faults. Diagnostic and prognostic applications will 
be deployed on the WRS to ensure its safe, efficient, and correct 
operation. The diagnostic and prognostic results can be used 
to enable condition-based maintenance to avoid unplanned out- 
ages, and perhaps extend the useful life of the WRS. Diagnosis 
involves detecting when a fault occurs, isolating the root cause 
of the fault, and identifying the extent of damage. Prognosis 
involves predicting when the system will reach its end of life 
irrespective of whether an abnormal condition is present or 
not. In this paper, first, we develop a physics model of both 
nominal and faulty system behavior of the WRS. Then, we apply 
an integrated model-based diagnosis and prognosis framework 
to the simulation model of the WRS for several different fault 
scenarios to detect, isolate, and identify faults, and predict the 
end of life in each fault scenario, and present the experimental 
results. 
1. Introduction 

The ability to recycle potable water from waste water is an 
integral part of the Environmental Control and Life Support 
System (ECLSS) of human-rated space missions. Several 
water recycling systems (WRSs) have been tested and de- 
ployed by NASA in the past, such as the Advanced Water 
Recovery System (AWRS) designed and built at the NASA 
Johnson Space Center (JSC) as part of the Advanced Life 
Support System [1], and the Direct Osmotic Concentration 
(DOC) System [2], currently undergoing performance testing 
at JSC. 
The WRS [3] deployed at NASA Ames Research Center’s 
Sustainability Base [4] - a Leadership in Energy and Environmental 
Design (LEED) certified energy efficient office 
building built to, among other things, put cutting-edge space 
technologies to work on Earth - has been developed to serve 
as a testbed for long duration testing of next generation 
spacecraft water recycling systems. This system cleans graywater 
(human waste water collected from sinks and showers) 
and recycles it into clean water to be used as flush water 
in the Sustainability Base with the goal of reducing the 
water consumption of the building by 60%. The WRS is 
mainly comprised of a forward osmosis (FO) system and a 
reverse osmosis (RO) system. In the FO system, the gray- 
water is separated from saltwater through semi-permeable 
membranes, and water moves through the semi-permeable 
membranes from a region of higher water chemical potential 
(i.e., graywater) to a region of lower water chemical potential 
(i.e., saltwater). In the RO system, hydraulic pressure is 
applied to the (now dilute) saltwater to force water from a 
region of lower water chemical potential (i.e., saltwater) to a 
region of higher water chemical potential (i.e., clean product 
water) through another set of semi-permeable membranes, 
thereby extracting clean water. 

The WRS is a complex hydraulic system with a large number 
of components. Complex engineered systems are subject to 
degradation even in regular use (as well as the possibility 
of incurring faults) and the WRS is no exception. Hence, 
diagnosis and prognosis applications will increasingly be 
implemented on future engineered systems to ensure their 
safe, efficient, and correct operation. The diagnostic and 
prognostic results can be used to enable condition-based 
maintenance to avoid unplanned outages, and perhaps extend 
the useful life of the system. Diagnosis involves detecting 
when a fault occurs, isolating the root cause of the fault, 
and identifying the extent of damage. Prognosis involves 
prediction of when the system will reach its end of (useful) 
life so that mitigating actions may be implemented. 

In this paper, we apply a model-based diagnosis and progno- 
sis framework [5] on the WRS. We generate a physics model 
of the nominal and faulty system behavior that captures the 
dynamics of the WRS in the hydraulic domain, as well 
as the concentration of solute in the system. Faults are 
modeled as unexpected changes in the system parameters. 
We assume the presence of only single, persistent faults but 
allow faults of different fault magnitudes. As the system 
operates, the observed measurements are compared to esti- 
mates of nominal measurements obtained from the nominal 
system model, and a statistically significant deviation of a 
measurement from nominal results in a fault being detected. Then, 
as measurements deviate, the observed measurement deviations 
are compared to predictions of how each measurement 
should deviate given particular faults, and any fault that is 
inconsistent with the observed measurement deviations is 
removed from consideration. For fault identification, once 
the number of fault candidates is reduced to less than a 
predefined number, a hypothesized fault model is generated 
for each remaining fault candidate, and joint state-parameter 
estimation is performed [6]. For prognosis, the end of life of 
the system is predicted using, for each hypothesized fault 
candidate, a predictor based on a fault progression model 
integrated with the nominal model [7]. Finally, we present 
results of several diagnosis and prognosis experiments 
performed on the simulation model of the WRS. 

Figure 1. Schematic of the complete Water Recycling System. 

The paper is organized as follows. Section 2 presents the 
nominal and faulty system model of the WRS. Section 3 
describes the diagnosis and prognosis approach used in this 
work. Experimental results are presented in Section 4, and 
Section 5 concludes the paper. 


2. Modeling the Water Recycling System 

The WRS installed at the Sustainability Base at NASA Ames 
Research Center uses osmosis for generating clean water 
from waste water. As mentioned earlier, the WRS is mainly 
comprised of a forward osmosis (FO) module and a reverse 
osmosis (RO) module. FO is the movement of solvent 
molecules (in our case, water) across a semi-permeable 
membrane from a region of higher water chemical potential 
(usually called the feed solution) to a region of lower water 
chemical potential (usually called the osmotic agent) [8]. RO, 
on the other hand, is the movement of solvent molecules 
across a semi-permeable membrane in the opposite direction 
of FO, i.e., from a region of lower water chemical potential 
to a region of higher water chemical potential due to the 
application of hydraulic pressure. 

Osmosis is driven by the difference in solute concentrations 
across the membrane that allows the solvent molecules to 
pass, but rejects most solute molecules and ions. The general 
equation describing water transport in FO and RO is 

$$J_w = A(\sigma \Delta\pi - \Delta P) \qquad (1)$$

where $J_w$ is the water flux (rate of flow of water per unit cross-sectional 
area), $A$ is the water permeability constant of the 
membrane (i.e., the measure of the transport flux of material 
through the membrane per unit driving force per unit membrane 
thickness), $\sigma$ is the reflection coefficient (i.e., a measure 
of how much a membrane can “reflect” solute particles from 
passing through), $\Delta\pi$ is the osmotic pressure differential, and 
$\Delta P$ is the applied (hydraulic) pressure differential. Osmotic 
pressure is the pressure that would prevent the transport of 
solvent across the membrane, when applied to the more 
concentrated solution. The driving force in FO is the osmotic 
pressure differential across the membrane ($\Delta\pi$), while in RO 
it is the applied hydraulic pressure differential ($\Delta P$), which opposes 
and exceeds the osmotic pressure differential to force water 
from a region of lower water chemical potential to a region 
of higher water chemical potential across the membrane. The 
hydraulic pressure is generated by pumps that are responsible 
for maintaining the needed pressure differential. Therefore, 
in Eqn. 1, $\Delta P \approx 0$ for FO and $\Delta P > \Delta\pi$ for RO. 
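
As a minimal numerical sketch of Eqn. 1 (not the WRS model itself), the following Python snippet evaluates the water flux for an FO-like case ($\Delta P \approx 0$) and an RO-like case ($\Delta P > \Delta\pi$). The function name `water_flux` and all numerical values are illustrative assumptions, not WRS parameters.

```python
def water_flux(A, sigma, delta_pi, delta_p):
    """Water flux J_w = A * (sigma * delta_pi - delta_p), as in Eqn. 1."""
    return A * (sigma * delta_pi - delta_p)


# Hypothetical values chosen only to illustrate the sign convention.
A = 2.0          # membrane permeability (arbitrary units)
sigma = 1.0      # ideal membrane: all solute is rejected
delta_pi = 10.0  # osmotic pressure differential (arbitrary units)

j_fo = water_flux(A, sigma, delta_pi, delta_p=0.0)   # FO: delta_p ~ 0
j_ro = water_flux(A, sigma, delta_pi, delta_p=15.0)  # RO: delta_p > delta_pi

print(j_fo)  # positive: flux driven by the osmotic pressure differential
print(j_ro)  # negative: applied pressure reverses the direction of flow
```

The change of sign between the two cases reflects the statement above that RO moves water in the opposite direction of FO.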

Fig. 1 presents a schematic of the WRS, which consists of 
several tanks, pumps, pipes, filters, and the FO and RO 
modules. During nominal operation, first, Pump 1 is switched 
on to pump water from the Waste Water Tank into Feed Tank 
1 until the latter is full. Then, Pump 1 shuts off and Pump 2 
is turned on to fill Feed Tank 2. Filter 1 between Pump 2 
and Feed Tank 2 traps suspended solids in the feed solution 
and prevents them from entering Feed Tank 2. Pump 2 runs 
until Feed Tank 2 is full. Pumps 5 and 6 are small diaphragm 
metering pumps that are turned on periodically to add antiscale 
chemicals (from the Antiscale Supply Tank) to the feed 
and adjust its pH (by adding chemicals from the pH Adjust 
Tank), respectively. Then Pump 4 is powered on to recirculate 
the feed water through Filter 2 and the FO module back to 
Feed Tank 2. The osmotic agent is stored in the Osmotic 
Agent (OA) Tank. The OA in the WRS is a salt (NaCl) 
solution. The concentration of OA determines the rate of flow 
of water. The goal is to maintain this flow at approximately 
155 L h$^{-1}$. However, during the nominal operation of the 
WRS, some NaCl is lost through the membranes. Hence, 
additional NaCl is added to the OA to maintain the flow of 
water through the membrane. The initial concentration of OA 
is 10 g L$^{-1}$, but the controller can add up to 20 g L$^{-1}$ of additional 
NaCl solution to the OA from the NaCl Supply Tank. 
The RO module applies an external pressure to maintain the 
flow of water through the RO membrane at approximately 155 L h$^{-1}$. 
The Reverse Osmosis (RO) pump recirculates the diluted OA 
between the RO and the FO modules. Clean water from the 
RO Module is collected in the Product Tank. The WRS is 
operated in a semi-batch mode, with no extra feed added to 
Feed Tank 2 once the FO and RO modules are started until 95% 
of clean water is recovered from the feed water through FO 
and RO, after which the remaining waste is disposed of. 

Figure 2. Schematic of the subset of Water Recycling System and sensors for experiments. 

In this paper, we apply our diagnosis and prognosis scheme 
to a subset of the WRS, as shown in Fig. 2. This subset 
consists of all components of the complete WRS except the 
Antiscale Supply Tank, pH Adjust Tank, the NaCl Supply 
Tank, and Pumps 5-7. These pumps are only on for short 
durations before the FO and RO modules are activated, and 
omitting these and the associated tanks does not adversely 
alter the main dynamics of the WRS. Note that in Fig. 2, the 
Osmotic Agent Tank is also not considered, and instead, the 
OA, i.e., NaCl, is assumed to be added directly in the FO-RO 
recirculation path. Moreover, Pump 8 is not turned on during 
the simulation, and it is also omitted in Fig. 2. 

Nominal Modeling 

We develop the nominal system model for the WRS using the 
state space formulation: 

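$$\dot{\mathbf{x}}(t) = \mathbf{f}(t, \mathbf{x}(t), \boldsymbol{\theta}(t), \mathbf{u}(t), \mathbf{v}(t)) \qquad (2)$$

$$\mathbf{y}(t) = \mathbf{h}(t, \mathbf{x}(t), \boldsymbol{\theta}(t), \mathbf{u}(t), \mathbf{n}(t)), \qquad (3)$$

where $t \in \mathbb{R}$ denotes continuous time, $\mathbf{x}(t) \in \mathbb{R}^{n_x}$ is the state 
vector, $\boldsymbol{\theta}(t) \in \mathbb{R}^{n_\theta}$ is the parameter vector, $\mathbf{u}(t) \in \mathbb{R}^{n_u}$ is the 
input vector, $\mathbf{v}(t) \in \mathbb{R}^{n_v}$ is the process noise vector, $\mathbf{f}$ is the 
state equation, $\mathbf{y}(t) \in \mathbb{R}^{n_y}$ is the output vector, $\mathbf{n}(t) \in \mathbb{R}^{n_n}$ 
is the measurement noise vector, and $\mathbf{h}$ is the output equation. 
The parameters $\boldsymbol{\theta}(t)$ are typically considered as constants in 
the nominal system model. 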

Our physics-based lumped-parameter model of the WRS 
represents its hydraulic dynamics. In the hydraulic domain, 
we denote volumetric flow rate as $q$ and hydraulic pressure as 
$p$. The pressures are the state variables in our model. As 
shown in Fig. 2, the pressures at the bottom of the Waste 
Water Tank, Feed Tank 1, Feed Tank 2, and Product Tank 
are denoted by $p_{WT}$, $p_{FT1}$, $p_{FT2}$, and $p_{Prod}$, respectively. The 
FO (resp. RO) module is modeled as two tanks, FO1 and 
FO2 (resp. RO1 and RO2), with pressures $p_{FO1}$ and $p_{FO2}$ 
(resp. $p_{RO1}$ and $p_{RO2}$), that are separated by the FO (resp. RO) 
membrane, and $q_{FO}$ (resp. $q_{RO}$) denotes the volumetric flow 
rate of water across the FO (resp. RO) membrane. The pipe 
segments Pipe 1 and Pipe 2, between Pump 2 and Filter 1, and 
Pump 4 and Filter 2, respectively, are also modeled as very 
small tanks, with pressures $p_{Pipe1}$ and $p_{Pipe2}$, respectively. 

The outflow rates of Pump 1, Pump 2, Pump 4, and the RO 
Pump are denoted by $q_{Pump1}$, $q_{Pump2}$, $q_{Pump4}$, and $q_{ROPump}$, 
respectively. The flows through Filter 1 and Filter 2 are 
denoted by $q_{Filt1}$ and $q_{Filt2}$, respectively, and $q_{Prod}$ denotes the 
rate of flow of clean water into the Product Tank from the 
RO module. $q_{FO1FT2}$ denotes the flow of water from the FO 
module to Feed Tank 2, and $q_{RO1FO2}$ denotes the flow of water 
from the RO module to the FO module. 

A Pump $j$ installed between two points with pressures $p_i$ 
and $p_j$, respectively, is modeled to boost the pressure at its 
input by its boost pressure $p_{Pumpj}$, i.e., the pressure difference 
between these two points with the pump in between is 
$p_i + p_{Pumpj} - p_j$. The boost pressure $p_{Pumpj}$ is considered as an 
input to the system. The boost pressures for Pump 1, Pump 2, 
Pump 4, and the RO Pump are denoted by $p_{Pump1}$, $p_{Pump2}$, $p_{Pump4}$, 
and $p_{ROPump}$, respectively. 

Given two points in a hydraulic system, with pressures $p_i$ and 
$p_j$, the volumetric flow rate of fluid between these two points 
is 

$$q_{ij} = R_{ij} \sqrt{|p_i - p_j|}\, \mathrm{sign}(p_i - p_j), \qquad (4)$$

where $R_{ij}$ is the coefficient of flow for $q_{ij}$. For a Tank $i$ 
having input and output flow rates $q_{in}$ and $q_{out}$, the pressure $p_{Tanki}$ 
evolves as 

$$\dot{p}_{Tanki} = \frac{1}{C_{Tanki}} (q_{in} - q_{out}), \qquad (5)$$

where $C_{Tanki}$ is the tank capacitance. 
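
The two building blocks above, the flow relation of Eqn. 4 and the tank pressure dynamics of Eqn. 5, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the forward-Euler integration, the two-tank arrangement, and every numerical value are hypothetical and are not taken from the WRS model of Fig. 3.

```python
import math


def flow(R, p_i, p_j):
    """Eqn. 4: q_ij = R_ij * sqrt(|p_i - p_j|) * sign(p_i - p_j)."""
    dp = p_i - p_j
    return math.copysign(R * math.sqrt(abs(dp)), dp)


def tank_step(p, C, q_in, q_out, dt):
    """One forward-Euler step of Eqn. 5: dp/dt = (q_in - q_out) / C."""
    return p + dt * (q_in - q_out) / C


def simulate_two_tanks(steps=2000, dt=0.1):
    """Hypothetical example: tank 1 drains into tank 2 through one pipe."""
    R, C1, C2 = 1.0, 10.0, 10.0   # assumed flow coefficient and capacitances
    p1, p2 = 4.0, 0.0             # assumed initial pressures
    for _ in range(steps):
        q12 = flow(R, p1, p2)
        p1 = tank_step(p1, C1, 0.0, q12, dt)   # q12 leaves tank 1
        p2 = tank_step(p2, C2, q12, 0.0, dt)   # q12 enters tank 2
    return p1, p2


p1, p2 = simulate_two_tanks()
print(p1, p2)  # the two pressures settle close to a common value
```

With equal capacitances, the pressures equalize while their sum is conserved, which is a quick sanity check on the sign conventions of Eqns. 4 and 5.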


In addition to the hydraulic dynamics, we also model the 
reduction of solute molecules in the OA over time. To this 
end, the amount of NaCl in the OA, $x_{NaCl}$, is considered a 
state variable. As mentioned before, we start with 10 g L$^{-1}$ 
of NaCl in the OA. During nominal operation of the WRS, 
some NaCl is lost through the membranes (we assume the 
rate of loss of salt to be $-1.11 \times 10^{-5}$ g L$^{-1}$ s$^{-1}$). Now, the 
osmotic potential $\Delta\pi$ is directly proportional to the difference 
in concentration on the two sides of the semi-permeable 
membrane. Also, the goal of the controller is to maintain the 
flow through the FO membrane at approximately 155 L h$^{-1}$. 
To maintain the osmotic pressure difference, and hence the 
rate of flow of water through the FO membrane, the controller 
adds additional amounts of NaCl, represented by $\Delta x_{NaCl}$, to 
the OA. However, the total amount of NaCl in the OA cannot 
be more than 30 g L$^{-1}$, and hence the maximum value of 
$\Delta x_{NaCl}$ is 20 g L$^{-1}$. This $\Delta x_{NaCl}$ also affects the flow of 
water through the RO membrane. 


Fig. 3 lists the equations that model the WRS, where the state 
variables include $\mathbf{x} = [p_{WT}, p_{FT1}, p_{Pipe1}, p_{FT2}, p_{Pipe2}, p_{FO1}, p_{FO2}, p_{RO1}, p_{RO2}, p_{Prod}, x_{NaCl}]^T$; 
the output variables include 
$\mathbf{y} = [q_{Pump2}, q_{Filt1}, q_{Filt2}, q_{Pump4}, q_{ROPump}, q_{Prod}, p_{WT}, p_{FT1}, p_{FT2}, p_{Prod}, p_{Filt1}, p_{Filt2}]^T$; 
and the input variables include 
$\mathbf{u} = [u_{Pump1}, u_{Pump2}, u_{Pump4}, u_{ROPump}, u_{FO}, u_{RO}]^T$. The pump 
input signals switch the corresponding pump on or off. The 
input signals $u_{FO}$ and $u_{RO}$ indicate when the FO 
and RO modules are filled with feed and osmotic agent (we 
assume there is no water in any of the tanks or plumbing 
at the start of the simulation), and hence when FO and RO can 
begin, respectively. All flows are expressed in L h$^{-1}$ and all 
pressures are expressed in psi. 

Modeling of Faulted System 

Typical degradation modes of the WRS include clogged 
membranes, clogged filters, and sensor faults. In particular, 
Filter 1, Filter 2, the FO membrane, and the RO membrane 
all get clogged over time due to buildup of solids. These 
clogging faults, denoted by $R^-_{Filt1}$, $R^-_{Filt2}$, $A^-_{FO}$, and $A^-_{RO}$, 
respectively, are represented as a gradual decrease in the coefficients 
of flow through the filters, $R_{Filt1}$ and $R_{Filt2}$, and the 
membrane permeabilities $A_{FO}$ and $A_{RO}$, respectively. A fault 
can then be modeled as an unexpected change in a system 
parameter. For Filter $i$, the gradual decrease in $R_{Filti}$ is 
represented as 

$$\dot{R}_{Filti} = \begin{cases} 0, & t < t_f \\ \Delta R_{Filti}, & \text{otherwise,} \end{cases} \qquad (31)$$

where $t_f$ is the time of fault occurrence, and $\Delta R_{Filti}$ is the 
fault parameter. Similarly, for membrane $j$, the gradual 
decrease in $A_j$ is represented as 

$$\dot{A}_j = \begin{cases} 0, & t < t_f \\ \Delta A_j, & \text{otherwise,} \end{cases} \qquad (32)$$

where $\Delta A_j$ is the fault parameter. 

Sensor faults can include abrupt bias and gradual drift faults. 
A bias fault in sensor $S$ is indicated as $S^{(b,\Delta S)}$, and is 
modeled as an abrupt addition of a constant bias $\Delta S$ to 
the sensor value from the point of fault injection $t_f$, i.e., 
the faulty sensor reading is 

$$\begin{cases} S, & t < t_f \\ S + \Delta S, & \text{otherwise.} \end{cases} \qquad (33)$$

A drift fault in sensor $S$ is indicated as $S^{(d,\Delta S)}$, and is 
modeled as a gradual addition of a constant drift $\Delta S$ to the 
sensor value at each time step from the point of fault injection 
$t_f$, i.e., 
the offset added to the sensor value changes per time step by 

$$\begin{cases} 0, & t < t_f \\ \Delta S, & \text{otherwise.} \end{cases} \qquad (34)$$

The sensor faults considered in this paper include $q_{Filt1}^{(b,20)}$, 
$q_{Filt2}^{(d,0.1)}$, $p_{Prod}^{(b,2)}$, and $q_{ROPump}^{(d,0.01)}$. 


3. Diagnosis and Prognosis Approach 

Fig. 4 illustrates the architecture of our diagnostic and prognostic 
approach, which is adopted from that presented in [5]. 
At each discrete time step, $k$, the system takes as inputs 
$\mathbf{u}(k)$ and outputs measurements $\mathbf{y}(k)$. The nominal model 
observer also takes as inputs $\mathbf{u}(k)$, and generates estimates 
of nominal measurements, $\hat{\mathbf{y}}(k)$. The fault detector then 
takes in the observed and estimated measurements, $\mathbf{y}(k)$ and 
$\hat{\mathbf{y}}(k)$, and detects when a fault has occurred based on the 
residual, $\mathbf{r}(k) = \mathbf{y}(k) - \hat{\mathbf{y}}(k)$. Once a fault is detected, 
fault isolation is initiated. The fault isolation block takes as 
inputs $\mathbf{r}(k)$. These measurement residuals are used along 
with predictions of how each measurement is expected to 
deviate from nominal for each possible fault in the system 
to generate a set of fault candidates $F(k)$ at time $k$ that 
explain the observed deviations in measurements until time $k$. 
The fault identification module, for each fault $f \in F(k)$, 
estimates $p(\mathbf{x}_f(k), \boldsymbol{\theta}_f(k) \mid \mathbf{y}(0{:}k))$, where $\mathbf{x}_f$ represents the 
set of state variables in the faulty system model, which includes 
all state variables of the nominal model and the faulty system 
parameter corresponding to the particular $f \in F(k)$ that 
needs to be estimated. $\boldsymbol{\theta}_f$ represents the set of all original system 
parameters except those that are now included in $\mathbf{x}_f$, and 
includes some additional fault progression model parameters 
that are used to model how the faulty parameter progresses 
over time (see [5] for details). Finally, the prediction module 
takes as input $p(\mathbf{x}_f(k), \boldsymbol{\theta}_f(k) \mid \mathbf{y}(0{:}k))$ to make predictions of 
End of Life (EOL), i.e., $p(EOL_f(k) \mid \mathbf{y}(0{:}k))$, and Remaining 
Useful Life (RUL), i.e., $p(RUL_f(k) \mid \mathbf{y}(0{:}k))$ [7]. 

A system is said to have reached its EOL when one or more 
constraints that define the acceptable behavior of the system 
are violated. For each faulty system model, we define a 
threshold function, $T_{EOL_f}$, where $T_{EOL_f}(\mathbf{x}_f(t), \boldsymbol{\theta}_f(t)) = 1$ if 
these constraints are violated, and $T_{EOL_f}(\mathbf{x}_f(t), \boldsymbol{\theta}_f(t)) = 0$ 
otherwise. So, $EOL_f$ may be defined as 

$$EOL_f(t_P) = \inf\{t \in \mathbb{R} : t \geq t_P \text{ and } T_{EOL_f}(\mathbf{x}_f(t), \boldsymbol{\theta}_f(t)) = 1\},$$

i.e., EOL is the earliest time point at which the threshold is 
reached. Given $EOL_f(t_P)$, RUL may then be defined as 
$RUL_f(t_P) = EOL_f(t_P) - t_P$. The remainder of this section 
describes the details of the different modules of the integrated 
diagnosis and prognosis architecture. 

Nominal Model Observer 

The nominal model observer typically takes as inputs the 
system inputs, $\mathbf{u}(k)$, the measurements, $\mathbf{y}(0{:}k)$, and the 
initial state of the system, and uses the state transition 
function, $\mathbf{f}(\cdot)$, and observation function, $\mathbf{h}(\cdot)$, to estimate 
distributions of states, $\mathbf{x}(k)$, and parameters, $\boldsymbol{\theta}(k)$, i.e., 
$p(\mathbf{x}(k), \boldsymbol{\theta}(k) \mid \mathbf{y}(0{:}k))$. Any appropriate filtering scheme, 
e.g., Kalman filter, extended Kalman filter, unscented Kalman 
filter, or particle filter [9], among others, can be adopted as the 
nominal observer. Note that in this paper, a high fidelity 
simulation model of the nominal WRS developed 
using the equations shown in Fig. 3 is used in place of the 
nominal observer to simulate the nominal system behavior 
given the inputs $\mathbf{u}$ and the initial state of the system. 

Fault Detection 

A fault is detected when a residual, $r(k) \in \mathbf{r}(k)$, i.e., 
the difference between the observed (faulty) and estimated 
(nominal) values of a measurement, is determined to be 
statistically significant [10]. In our work, we use a Z-test 
coupled with a sliding window technique to determine this 
statistical significance [10]. Fault detectors need to be tuned 
so as to minimize false alarms and missed detections while 
maintaining the desired level of sensitivity. 
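
A simplified sketch of a windowed Z-test on a single residual is shown below. It is not the exact detector of [10]: it assumes the nominal residual is zero-mean with a known standard deviation `sigma`, and the window length and threshold are hypothetical tuning values of the kind referred to above.

```python
import math
from collections import deque


def detect_fault(residuals, sigma, window=20, z_threshold=3.0):
    """Return the first index at which the mean of the last `window`
    residuals is significant under a two-sided Z-test, or None."""
    recent = deque(maxlen=window)
    for k, r in enumerate(residuals):
        recent.append(r)
        if len(recent) < window:
            continue  # wait until the sliding window is full
        mean = sum(recent) / window
        z = mean / (sigma / math.sqrt(window))  # Z statistic of the window mean
        if abs(z) > z_threshold:
            return k
    return None


# Hypothetical residual: zero-mean chatter, then a slow drift from step 100.
nominal = [0.5 if k % 2 == 0 else -0.5 for k in range(300)]
faulty = [r + (0.05 * (k - 100) if k >= 100 else 0.0)
          for k, r in enumerate(nominal)]

print(detect_fault(nominal, sigma=1.0))  # no detection expected
print(detect_fault(faulty, sigma=1.0))   # index of the first detection
```

A larger window or threshold lowers the false alarm rate at the cost of a longer detection delay, which is the tuning trade-off described above.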

Fault Isolation 

Once a fault is detected, at each subsequent time step, every 
measurement residual is qualitatively abstracted into a tuple 
of qualitative symbols, $(\sigma_1, \sigma_2)$, where $\sigma_1 \in \{0, +, -\}$ represents 
the qualitative magnitude change, and $\sigma_2 \in \{0, +, -\}$ 
represents the qualitative slope change. The symbols 0, 
+, or - denote whether the magnitude or slope of this 
measurement is at, above, or below nominal, respectively. 
The symbols are generated using a sliding window technique 
as described in detail in [10]. 

Figure 3. Equations of the nominal WRS model. 

Figure 4. Diagnosis and Prognosis Architecture. 




Based on the first observed statistically significant measurement 
deviation, we generate a set of possible fault candidates. 
Then, for each fault candidate, we systematically determine a 
fault signature for each measurement [11]. A fault signature 
of a fault for a measurement is a prediction of how the 
measurement will deviate from nominal due to the fault. Fault 
signatures are also of the form $(s_1, s_2)$, where $s_1 \in \{0, +, -\}$ 
and $s_2 \in \{0, +, -\}$ capture qualitatively the direction of 
change to be expected in the magnitude and slope of each 
measurement from nominal if the fault occurs. 

Given the set of fault candidates, as measurements deviate 
from nominal, the observed measurement deviations (cap- 
tured symbolically) are checked for consistency with pre- 
dicted fault signatures and measurement orderings. Any fault 
candidate whose predictions are inconsistent is removed from 
consideration. As more and more measurement deviations are 
observed, the candidate set will reduce, ideally resulting in a 
singleton. 
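
The candidate pruning described above can be sketched as a simple consistency check. The signatures below are copied from Table 1 for three faults and three measurements only; the dictionary layout, the string labels, and the function `prune` are illustrative assumptions, and measurement orderings are not modeled in this sketch.

```python
# Qualitative signatures (magnitude, slope) from Table 1, restricted to
# three faults and three measurements for brevity.
SIGNATURES = {
    "R_Filt1-": {"q_Filt2": "0-", "p_Filt1": "0+", "p_Filt2": "0-"},
    "R_Filt2-": {"q_Filt2": "0-", "p_Filt1": "00", "p_Filt2": "0+"},
    "A_FO-":    {"q_Filt2": "0-", "p_Filt1": "00", "p_Filt2": "0-"},
}


def prune(candidates, measurement, observed):
    """Keep only candidates whose predicted signature for `measurement`
    matches the observed qualitative deviation."""
    return {f for f in candidates
            if SIGNATURES[f][measurement] == observed}


candidates = set(SIGNATURES)
# A decrease in q_Filt2 is predicted by all three faults: nothing is removed.
candidates = prune(candidates, "q_Filt2", "0-")
# An increase in p_Filt2 is predicted only by Filter 2 clogging.
candidates = prune(candidates, "p_Filt2", "0+")
print(candidates)
```

Only measurements that have actually deviated are checked, so a candidate is never discarded merely because one of its predicted effects has not yet appeared.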

However, in some cases, the qualitative fault signatures alone 
are not sufficient in distinguishing all faults, or fault effects 
may take too long to manifest, and quantitative analysis is 
needed to correctly diagnose the true fault. The advantage 
of using qualitative fault isolation is that it reduces the fault 
candidates very quickly, thereby improving the scalability 
of the overall diagnosis task. Hence, the more diagnosable 
the system is, the smaller is the number of possible fault 
candidates remaining after fault isolation is performed, and 
fewer will be the faults that will have to be isolated through 
relatively (computationally) expensive quantitative methods. 

Fault Identification 

We initiate quantitative fault identification after qualitative 
fault signature-based isolation is executed for $\rho$ time steps 
or until the number of fault candidates reduces to less than $\sigma$, 
whichever is achieved first. The design parameters $\rho$ and $\sigma$ 
are chosen based on the design requirements of the integrated 
diagnostic and prognostic system. 

Once fault identification is invoked, under the single fault 
assumption, for each remaining fault candidate, $f$, we instantiate 
an observer using its faulty system model by extending 
the nominal system model with the fault progression model. 
Then each fault observer tracks the observed system measurements 
independently, and generates estimates of $\mathbf{y}(k)$ 
and $p(\mathbf{x}_f(k), \boldsymbol{\theta}_f(k) \mid \mathbf{y}(k_d - \Delta k_{max}{:}k))$. $\Delta k_{max}$ is usually 
assumed to be larger than the time difference between the 
time of fault occurrence, $k_f$, and the time of fault detection, 
$k_d$. Each fault observer is initialized to the estimated values 
of $\mathbf{x}$ and $\boldsymbol{\theta}$ obtained from the nominal observer at time 
$k_d - \Delta k_{max}$, and the fault parameters are initialized to zero. 
If multiple fault candidates remain when fault identification 
is invoked, for each fault observer, a Z-test is used to 
determine if the deviation of a measurement estimated by 
the observer from the corresponding actual observation is 
statistically significant. Since we are considering only single 
faults, the expectation is that eventually, the estimates of 
only the correct fault observer will converge to the observed 
measurements, while those of all others will deviate from the 
observed measurements. Thus fault identification also helps 
in fault isolation. Practically, even the true fault model will 
take some time before tracking the measurements correctly, 
since initially the fault parameter values are unlikely to 
be tuned to their true values. We assume that the true 
fault observer will converge to the observed measurements 
within $s_d$ time steps of its invocation. Thus, the Z-tests are 
monitored only after $s_d$ time steps are over [6]. 


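Algorithm 1 EOL Prediction 

Inputs: $\{(\mathbf{x}_f^i(k_P), \boldsymbol{\theta}_f^i(k_P)), w^i(k_P)\}_{i=1}^{N}$ 
Outputs: $\{EOL_f^i(k_P), w^i(k_P)\}_{i=1}^{N}$ 

for $i = 1$ to $N$ do 
    $k \leftarrow k_P$ 
    $\mathbf{x}_f^i(k) \leftarrow \mathbf{x}_f^i(k_P)$ 
    $\boldsymbol{\theta}_f^i(k) \leftarrow \boldsymbol{\theta}_f^i(k_P)$ 
    while $T_{EOL_f}(\mathbf{x}_f^i(k), \boldsymbol{\theta}_f^i(k)) = 0$ do 
        Predict $\mathbf{u}(k)$ 
        $\boldsymbol{\theta}_f^i(k+1) \sim p(\boldsymbol{\theta}_f(k+1) \mid \boldsymbol{\theta}_f^i(k))$ 
        $\mathbf{x}_f^i(k+1) \sim p(\mathbf{x}_f(k+1) \mid \mathbf{x}_f^i(k), \boldsymbol{\theta}_f^i(k), \mathbf{u}(k))$ 
        $k \leftarrow k + 1$ 
        $\mathbf{x}_f^i(k) \leftarrow \mathbf{x}_f^i(k+1)$ 
        $\boldsymbol{\theta}_f^i(k) \leftarrow \boldsymbol{\theta}_f^i(k+1)$ 
    end while 
    $EOL_f^i(k_P) \leftarrow k$ 
end for 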


Prediction 

The prediction module is invoked at time $k_P$ to predict 
the EOL and/or RUL of the component for each hypothesized 
fault, $f$. Specifically, using the current joint state-parameter 
estimate, $p(\mathbf{x}_f(k_P), \boldsymbol{\theta}_f(k_P) \mid \mathbf{y}(0{:}k_P))$, which 
represents the most up-to-date knowledge of the system at 
time $k_P$, the goal is to compute $p(EOL_f(k_P) \mid \mathbf{y}(0{:}k_P))$ and 
$p(RUL_f(k_P) \mid \mathbf{y}(0{:}k_P))$. As described in detail in [12], we 
assume the state-parameter distribution is represented as a 
discrete set of weighted samples, i.e., 

$$p(\mathbf{x}_f(k_P), \boldsymbol{\theta}_f(k_P) \mid \mathbf{y}(0{:}k_P)) \approx \sum_{i=1}^{N} w^i(k_P)\, \delta_{(\mathbf{x}_f^i(k_P), \boldsymbol{\theta}_f^i(k_P))}(d\mathbf{x}_f(k_P)\, d\boldsymbol{\theta}_f(k_P)),$$

where $i$ denotes the index of a single sample, $w^i$ is the weight 
of this sample, and $\delta$ represents the Dirac delta function 
located at $(\mathbf{x}_f^i(k_P), \boldsymbol{\theta}_f^i(k_P))$. 

Similarly, we can approximate the EOL as 

$$p(EOL_f(k_P) \mid \mathbf{y}(0{:}k_P)) \approx \sum_{i=1}^{N} w^i(k_P)\, \delta_{EOL_f^i(k_P)}(dEOL_f(k_P)).$$

The general approach to solving the prediction problem 
is through simulation. Each sample is simulated forward 
to EOL to obtain the complete EOL distribution. The 
pseudocode for the prediction procedure is given as Algorithm 1 [7]. 
Each sample $i$ in the state-parameter distribution 
is propagated forward until $T_{EOL_f}(\mathbf{x}_f^i(k), \boldsymbol{\theta}_f^i(k))$ evaluates to 
1, at which point EOL has been reached for this particle, and 
the EOL prediction is weighted by the weight of the sample 
at $k_P$. 

Note that we need to hypothesize future inputs of the system, 
u (k), for prediction, since fault progression is dependent 
on the operational conditions of the system. The choice of 
expected future inputs depends on the knowledge of expected 
operational settings. 
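
The simulation-based prediction of Algorithm 1 can be sketched in Python as follows. The generic loop mirrors the algorithm; the toy clogging model used to exercise it (a single flow coefficient with a wear rate, a constant hypothesized future flow, and a pressure threshold derived from Eqn. 4) and all of its numbers are hypothetical assumptions, not the WRS fault models.

```python
import random


def predict_eol(particles, k_p, step, reached_eol, future_input,
                max_steps=10**6):
    """Simulate each weighted sample forward until the EOL threshold fires.

    particles: list of (x, theta, w) tuples at prediction time k_p.
    Returns a list of (eol_i, w_i) pairs approximating the EOL distribution.
    """
    out = []
    for x, theta, w in particles:
        k = k_p
        while not reached_eol(x, theta) and k - k_p < max_steps:
            u = future_input(k)           # hypothesized future input
            x, theta = step(x, theta, u)  # sample next state and parameters
            k += 1
        out.append((k, w))
    return out


# Toy model (assumed): x is a filter flow coefficient R, theta its wear rate.
Q_FUTURE = 1.0  # assumed constant future flow through the filter
P_MAX = 5.0     # assumed differential-pressure threshold


def step(x, theta, u, theta_noise=0.0):
    # u is unused here because the toy state does not depend on the input.
    theta_next = theta + random.gauss(0.0, theta_noise)  # random-walk parameter
    return x + theta, theta_next


def reached_eol(x, theta):
    # From Eqn. 4, q = R * sqrt(dp), so dp = (q / R) ** 2.
    return (Q_FUTURE / x) ** 2 > P_MAX


particles = [(1.0, -0.01, 0.5), (1.0, -0.02, 0.5)]
eol = predict_eol(particles, 0, step, reached_eol, lambda k: Q_FUTURE)
mean_eol = sum(k * w for k, w in eol)   # weighted mean of the EOL samples
print(eol, mean_eol)
```

Subtracting the prediction time from each EOL sample gives the corresponding RUL samples with the same weights.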
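Table 1. Fault signatures for selected faults and measurements. 

| Fault | $q_{Pump2}$ | $q_{Filt1}$ | $q_{Filt2}$ | $q_{Pump4}$ | $q_{ROPump}$ | $q_{Prod}$ | $p_{WT}$ | $p_{FT1}$ | $p_{FT2}$ | $p_{Prod}$ | $p_{Filt1}$ | $p_{Filt2}$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| $R^-_{Filt1}$ | 0- | 0- | 0- | 0- | 00 | 00 | 00 | 0+ | 0- | 00 | 0+ | 0- |
| $R^-_{Filt2}$ | 00 | 00 | 0- | 0- | 00 | 00 | 00 | 00 | 0- | 00 | 00 | 0+ |
| $A^-_{FO}$ | 00 | 0- | 0- | 0- | 0- | 00 | 00 | 00 | 0+ | 00 | 00 | 0- |
| $A^-_{RO}$ | 00 | 00 | 00 | 00 | 0- | 0- | 00 | 00 | 00 | 0- | 00 | 00 |
| $q_{Filt1}^{(b,20)}$ | 00 | +0 | 00 | 00 | 00 | 00 | 00 | 00 | 00 | 00 | 00 | 00 |
| $q_{Filt2}^{(d,0.1)}$ | 00 | 00 | 0+ | 00 | 00 | 00 | 00 | 00 | 00 | 00 | 00 | 00 |
| $p_{Prod}^{(b,2)}$ | 00 | 00 | 00 | 00 | 00 | 00 | 00 | 00 | 00 | +0 | 00 | 00 |
| $q_{ROPump}^{(d,0.01)}$ | 00 | 00 | 00 | 00 | 0+ | 00 | 00 | 00 | 00 | 00 | 00 | 00 |
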

4. Experimental Results 

This section presents the results of our diagnosis and prognosis 
experiments on the simulation model of the WRS shown in 
Fig. 2. For these experiments, as mentioned in Section 2, we 
selected eight different faults, namely $R^-_{Filt1}$, $R^-_{Filt2}$, $A^-_{FO}$, $A^-_{RO}$, 
$q_{Filt1}^{(b,20)}$, $q_{Filt2}^{(d,0.1)}$, $p_{Prod}^{(b,2)}$, and $q_{ROPump}^{(d,0.01)}$. Table 1 provides the fault 
signature table for the selected faults and measurements of the 
WRS. Note that sensor faults affect only the signature for the 
faulty sensor. Parametric faults such as the clogging of filters 
and membranes cause more than one sensor to deviate from 
nominal. 
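
To illustrate why a bias fault produces a +0 signature and a drift fault a 0+ signature on the faulty sensor, the sensor fault models of Section 2 can be sketched as below. The function names, the constant nominal reading, and the convention that the drift offset is zero at the injection step are assumptions made for this example.

```python
def inject_bias(values, k_f, delta):
    """Bias fault: add a constant offset `delta` from step k_f onward."""
    return [v + delta if k >= k_f else v for k, v in enumerate(values)]


def inject_drift(values, k_f, delta):
    """Drift fault: the offset grows by `delta` per step after k_f
    (assumed to be zero at k_f itself)."""
    return [v + delta * (k - k_f) if k >= k_f else v
            for k, v in enumerate(values)]


nominal = [100.0] * 6   # hypothetical constant fault-free reading

biased = inject_bias(nominal, k_f=2, delta=20.0)
drifting = inject_drift(nominal, k_f=2, delta=0.1)

# Residuals with respect to the nominal reading.
print([b - n for b, n in zip(biased, nominal)])    # abrupt jump, then flat
print([d - n for d, n in zip(drifting, nominal)])  # no jump, steady growth
```

The bias residual jumps and then stays flat (magnitude +, slope 0), while the drift residual starts at zero and grows steadily (magnitude 0, slope +).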

For the purposes of prognosis, the EOL of the WRS is defined 
by when the filters need to be replaced. This is indicated by 
when the differential pressures across the individual filters, 
$p_{Filt1}$ or $p_{Filt2}$, cross a certain pressure threshold, $p^+_{Filt1}$ or 
$p^+_{Filt2}$. Hence, $T_{EOL_f} = 1$ if $p_{Filt1} > p^+_{Filt1}$ or $p_{Filt2} > p^+_{Filt2}$. 

In our experiments, for fault detection, we use the simulation 
model of the nominal system to generate nominal system 
behavior. The fault signatures for faults considered in our 
experiments and the WRS measurements are given in Table 1, 
and used for fault isolation. For fault identification, we 
adopt particle filtering [9] as our observer. Particle filtering 
is the most general estimation scheme as it can be applied 
to nonlinear systems with arbitrary probability distributions 
for process and measurement noise that can be nonlinearly 
coupled with the states. Particle filtering is a sequential 
Monte Carlo sampling method for Bayesian filtering and 
approximates the belief state of a system using a weighted 
set of samples, or particles. Each particle consists of an 
instantiation of values of the state vector, and describes a 
possible system state. As observations are obtained, each par- 
ticle is moved stochastically to a new state using the nominal 
state transition function, and the weight of each particle is 
readjusted to reflect the likelihood of that observation given 
the particle’s new state. We assume all random variables to 
be Gaussian. 
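
A single update of a bootstrap particle filter, following the move-then-reweight description above, might look like the sketch below. The resampling step, the scalar Gaussian likelihood, and the toy transition and measurement functions are assumptions made for illustration; they are not the WRS observer.

```python
import math
import random


def pf_step(particles, y, transition, measure, meas_std, rng=random):
    """One bootstrap particle-filter update.

    particles: list of (state, weight) pairs; returns the same structure.
    """
    # 1. Move each particle through the (stochastic) transition function.
    moved = [transition(x, rng) for x, _ in particles]
    # 2. Re-weight by the Gaussian likelihood of the observation y.
    weights = [w * math.exp(-0.5 * ((y - measure(x)) / meas_std) ** 2)
               for x, (_, w) in zip(moved, particles)]
    n = len(moved)
    if sum(weights) == 0.0:      # all likelihoods underflowed: use uniform
        weights = [1.0] * n
    # 3. Resample with replacement in proportion to the weights.
    resampled = rng.choices(moved, weights=weights, k=n)
    return [(x, 1.0 / n) for x in resampled]


# Toy joint state-parameter model (assumed): state = (R, wear rate).
def transition(state, rng, q_std=0.0):
    r, d_r = state
    return (r + d_r + rng.gauss(0.0, q_std), d_r)


def measure(state):
    return state[0]   # the sensor observes R directly in this toy example


particles = [((0.0, 1.0), 0.5), ((10.0, 1.0), 0.5)]
updated = pf_step(particles, y=1.0, transition=transition,
                  measure=measure, meas_std=0.5)
print(updated)  # particles concentrate on the state that explains y
```

Including the wear rate in the particle state is what turns the filter into a joint state-parameter estimator for fault identification.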

We now present a detailed integrated diagnosis and prognosis 
scenario to illustrate our approach. In this scenario, Filter 
2 clogging begins at $t = 0$ min according to Eqn. 31 with 
wear rate $\Delta R_{Filt2} = -5 \times 10^{-12}$. A fault is detected at 
309 min, via an increase in the differential Filter 2 pressure, 
$p_{Filt2}$ (see Fig. 5). As shown in Table 1, only fault $R^-_{Filt2}$ has 
a 0+ signature for $p_{Filt2}$, indicating that the fault $R^-_{Filt2}$ would 
cause the pressure $p_{Filt2}$ to increase. Since this is the only 
fault consistent with the observed deviation, a singleton fault 
candidate set, $\{R^-_{Filt2}\}$, is generated, and the fault is detected 
and isolated at the same time. 

Figure 5. Estimated and observed values of sensor $p_{Filt2}$. 

Fault identification is initiated once the qualitative isolator
has reduced the number of fault candidates to three or fewer
(i.e., $\sigma = 3$), or once the qualitative isolator has executed
for p = 400 min. For our particular problem, we found
N = 50 particles sufficient for accurate tracking, and used
$\Delta k_{max} = 0$ for each observer used for fault identification.
For the Filter 2 clogging fault, the wear rate estimate
averages to $\Delta R_{Filt2} = -5.11759 \times 10^{-12}$ with small output
error (see Fig. 6). The corresponding RUL predictions, made
at 10 min intervals from the time the fault identifier
converges to a solution, are shown in Fig. 7, which plots
the predicted RUL [13] of the WRS under $R_{Filt2}$ from
t = 540 min onward. As mentioned in Section 3, at
each prediction point, Fig. 7 shows the true RUL, RUL*, and the
probability density function of the predicted RUL, represented
by its median value and the 5-25% and 75-95% ranges.
The plot also shows a cone of $\alpha = 10\%$ accuracy around the
true RUL. From the first prediction point, at t = 540 min,
the algorithm has converged, and the median RUL predictions
remain within the 10% accuracy window except at t = 610 min,
t = 620 min, and t = 640 min. To make predictions, we
assume that the future inputs are known. Hence, the
uncertainty in the predictions stems solely from the
identification stage, which explains why not all RUL
predictions fell within the accuracy cone. In our
simulation experiments, for illustrative purposes, we chose
$\alpha = 10\%$. In real-world scenarios, however, the value of
$\alpha$ flows down from the top-level requirements [14].

Figure 6. Estimated values.

Figure 7. Predicted RUL of the WRS under the $R_{Filt2}$ fault. The
median is indicated with a dot and confidence intervals of
5% and 95% by lines. The gray cone depicts an accuracy
requirement of $\alpha = 10\%$.
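
The accuracy cone used with the RUL predictions amounts to an
interval test on each median prediction. The sketch below encodes
the interval [(1 - α) RUL*, (1 + α) RUL*) from the Fig. 7 legend.
The function name and the numeric values in the example are
hypothetical and are not taken from the experiments.

```python
def within_alpha_cone(rul_median, rul_true, alpha=0.10):
    """True if the median RUL lies in [(1 - alpha) RUL*, (1 + alpha) RUL*)."""
    lower = (1.0 - alpha) * rul_true
    upper = (1.0 + alpha) * rul_true
    return lower <= rul_median < upper


def points_outside_cone(predictions, alpha=0.10):
    """Return the prediction times whose median RUL misses the cone.

    predictions: iterable of (time_min, rul_median, rul_true) tuples.
    """
    return [
        t for t, median, true in predictions
        if not within_alpha_cone(median, true, alpha)
    ]


# Hypothetical values for illustration only (not the Fig. 7 data).
example = [
    (540, 1010.0, 1000.0),  # 1% above the true RUL: inside the cone
    (550, 880.0, 990.0),    # about 11% below the true RUL: outside
    (560, 1000.0, 980.0),   # about 2% above the true RUL: inside
]
outside = points_outside_cone(example)
```

Because the cone is proportional to RUL*, it narrows in absolute
terms as the system approaches end of life, so a fixed absolute
error is more likely to fall outside it at later prediction points.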

Simulation Results 

Table 2 summarizes the detection and isolation results of
several simulation experiments. The columns of the table
represent the true fault; the true injected value of the fault
parameter; $t_f$, the time of fault occurrence in minutes from
the start of the experiment; $\Delta t_d$, the time in minutes to
detect the fault; $\Delta t_i$, the time in minutes for qualitative
isolation to reduce the candidate set as much as possible; and
the set of fault candidates after qualitative fault isolation.
Given the small number of faults, in each of the experiments
the observed measurement deviation resulted in a singleton
fault candidate set (with the true fault being the only fault
candidate). As a result, the fault was detected and isolated
at the same time, and hence $\Delta t_i = \Delta t_d$ for these
experiments. Note that this is typically not the case in
large systems with many possible faults, where more than one
measurement deviation is needed to isolate the true
single-fault candidate. Once the sensor faults are correctly
isolated and identified, the sensor readings can be “corrected”,
and hence the presence of this type of sensor fault does not
cause the system to violate the constraints of acceptable
behavior. Hence, for sensor faults, prognosis is not applicable
since we assumed a fault mode that manifests itself without
measurable precursors. The prognosis results for the $R_{Filt2}$
fault have already been presented above. In our experimental
runs, the slowly progressing filter and membrane blockage
faults take between 37.87 min and 309.37 min to be detected.
The sensor faults, however, are detected and isolated more
quickly, between 0.02 min and 8.67 min.

Table 2. Diagnosis Results

True Fault                True Fault Magnitude       $t_f$ (min)  $\Delta t_d$ (min)  $\Delta t_i$ (min)  Fault Candidates
Nominal                   N/A                        N/A          $\infty$            $\infty$            $\emptyset$
$R_{Filt1}$               $-1.00 \times 10^{-12}$    1.00         60.67               60.67               $R_{Filt1}$
$R_{Filt2}$               $-5.00 \times 10^{-12}$    1.00         309.37              309.37              $R_{Filt2}$
$R_{FO}$                  $-1.60 \times 10^{-11}$    326.00       232.70              232.70              $R_{FO}$
$R_{RO}$                  $-2.00 \times 10^{-10}$    326.00       37.87               37.87               $R_{RO}$
(b, 20) $p_{Filt1}$       20.00                      175.00       0.03                0.03                (b, 20) $p_{Filt1}$
(d, 0.1) $p_{Filt2}$      0.10                       400.00       0.90                0.90                (d, 0.1) $p_{Filt2}$
(b, 2) $p_{prod}$         2.00                       410.00       0.02                0.02                (b, 2) $p_{prod}$
(d, 0.01) ROPump sensor   0.01                       404.00       8.67                8.67                (d, 0.01) ROPump sensor
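
The detection-time ranges quoted above can be checked directly
against the Table 2 values. In the short sketch below, the fault
labels are our reading of the table and serve only as keys, and the
class names `blockage` and `sensor` are labels introduced here for
grouping.

```python
# (fault label, fault class, detection time in minutes) transcribed from Table 2.
RESULTS = [
    ("R_Filt1", "blockage", 60.67),
    ("R_Filt2", "blockage", 309.37),
    ("R_FO", "blockage", 232.70),
    ("R_RO", "blockage", 37.87),
    ("(b, 20) p_Filt1", "sensor", 0.03),
    ("(d, 0.1) p_Filt2", "sensor", 0.90),
    ("(b, 2) p_prod", "sensor", 0.02),
    ("(d, 0.01) ROPump sensor", "sensor", 8.67),
]


def detection_range(results, fault_class):
    """Return (fastest, slowest) detection time in minutes for one fault class."""
    times = [dt for _, cls, dt in results if cls == fault_class]
    return min(times), max(times)


blockage_range = detection_range(RESULTS, "blockage")
sensor_range = detection_range(RESULTS, "sensor")
```

The two ranges do not overlap: the slowest sensor fault is detected
in 8.67 min, well before the fastest blockage fault at 37.87 min.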


5. Conclusions 

This paper applied an integrated model-based diagnostic and 
prognostic framework to a WRS designed to serve as a 
testbed for long duration testing of next generation spacecraft 
WRS for human spaceflight missions. Our approach made 
use of a common modeling paradigm to model both the 
nominal and faulty system behavior, and we successfully 
demonstrated diagnosis and prognosis results on the WRS. 

As part of future work, we are planning to analyze the real- 
world experimental data from the WRS at the Sustainability 
Base to refine our WRS simulation model. We also plan to 
extend the model by including the modeling of the mass flow 
conservation of solute and solvent molecules. Since the WRS 
qualifies as a complex system, improvements in efficiency 
and scalability can be achieved by running distributed diag- 
nosis and prognosis algorithms on this system [15]. Finally, 
we will investigate the effect of relaxing the single fault as- 
sumption and extend our approach to diagnosis and prognosis 
of multiple faults in the WRS. 